Optimal Payoff Functions for Members of Collectives
Abstract
We consider the problem of designing (perhaps massively distributed) collectives of computational processes to maximize a provided “world” utility function. We consider this problem when the behavior of each process in the collective can be cast as striving to maximize its own payoff utility function. For such cases the central design issue is how to initialize/update the payoff utility functions of the individual processes so as to induce behavior of the entire collective that yields good values of the world utility. Traditional “team game” approaches to this problem simply assign the world utility to each process as its payoff utility function. In previous work we used the “Collective Intelligence” (COIN) framework to derive a better choice of payoff utility functions, one that results in world utility performance up to orders of magnitude superior to that obtained with the team game utility. In this paper we extend these results using a novel mathematical framework. We review the derivation, under that new framework, of the general class of payoff utility functions that both i) are easy for the individual processes to try to maximize, and ii) have the property that if good values of them are achieved, then we are assured of a high value of world utility. These are the “Aristocrat Utility” and a new variant of the “Wonderful Life Utility” that was introduced in the previous COIN work. We demonstrate experimentally that using these new utility functions can result in significantly improved performance over that of previously investigated COIN payoff utilities, over and above those previous utilities’ superiority to the conventional team game utility. These results also illustrate the substantial superiority of these payoff functions to perhaps the most natural version of the economics technique of “endogenizing externalities”.
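The abstract names three payoff utilities without giving formulas. The following is a minimal sketch of how they might be compared on a toy problem, under stated assumptions: the world utility `world_utility` (a congestion penalty) is invented for illustration; the Wonderful Life Utility is taken to be the world utility minus the world utility with the agent's action clamped to a fixed value (here a null action, `None`); and the Aristocrat Utility is taken to be the world utility minus its average over the agent's alternative actions, with a uniform average assumed. These forms follow the way the utilities are commonly described in the COIN literature and are not taken from this abstract, which also refers to a new WLU variant that is not modelled here.

```python
from collections import Counter

def world_utility(actions):
    """Toy world utility G (hypothetical): negative sum of squared resource loads.
    Each entry of `actions` is the resource an agent picks; None means the
    agent takes a null action and contributes no load."""
    loads = Counter(a for a in actions if a is not None)
    return -sum(n * n for n in loads.values())

def team_game_utility(actions, i):
    """Team game payoff: every agent simply receives the world utility."""
    return world_utility(actions)

def wonderful_life_utility(actions, i, clamp=None):
    """Assumed WLU form: G(z) minus G with agent i's action clamped to `clamp`."""
    clamped = list(actions)
    clamped[i] = clamp
    return world_utility(actions) - world_utility(clamped)

def aristocrat_utility(actions, i, choices):
    """Assumed AU form: G(z) minus the average of G over agent i's
    alternative actions (uniform weighting is an assumption)."""
    total = 0.0
    for a in choices:
        alternative = list(actions)
        alternative[i] = a
        total += world_utility(alternative)
    return world_utility(actions) - total / len(choices)

# Three agents, two resources; agents 0 and 1 crowd resource 0.
actions = [0, 0, 1]
choices = [0, 1]
for i in range(len(actions)):
    print(i,
          team_game_utility(actions, i),
          wonderful_life_utility(actions, i),
          aristocrat_utility(actions, i, choices))
```

In this toy setting the team game payoff is identical for all agents, while the two difference utilities vary with each agent's own contribution, which is the intuition behind property i) above.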
Similar resources
On Memoryless Quantitative Objectives
In two-player games on graphs, the players construct an infinite path through the game graph and get a reward computed by a payoff function over infinite paths. Over weighted graphs, the typical and most studied payoff functions compute the limit-average or the discounted sum of the rewards along the path. Besides their simple definition, these two payoff functions enjoy the property that memory...
Fair and Rational Schemes for Payoff Allocation in Supply Chain Design Problems
Supply chain design problems can be analyzed as cooperative linear production games. The maximal total payoff and the optimal coalition of a “market responsive” supply network can be obtained from the solution of the mixed-variables Linear Programming problem. Then, using duality theory, the “Owen set” can be constructed in order to allocate the payoff among the members of the optimal coalition...
Pure Stationary Optimal Strategies in Markov Decision Processes
Markov decision processes (MDPs) are controllable discrete event systems with stochastic transitions. Performances of an MDP are evaluated by a payoff function. The controller of the MDP seeks to optimize those performances, using optimal strategies. There exist various ways of measuring performances, i.e. various classes of payoff functions. For example, average performances can be evaluated ...
Extensional and intensional collectives and the de re/de dicto distinction
Expressions designating collectives, such as “the committee” or “the ships in the port”, may be interpreted de re or de dicto, depending on context, according as they pick out collectives defined by their members or collectives defined by some criterion for membership. We call these E(xtensional)-collectives and I(ntensional)-collectives respectively, and in this paper we explore in depth the r...
A Class of Markov Decision Processes with Pure and Stationary Optimal Strategies
We are interested in the existence of pure and stationary optimal strategies in Markov decision processes. We restrict to Markov decision processes with finitely many states and actions and infinite duration. In a Markov decision process, each state is labelled by an immediate payoff and each infinite history generates a stream of immediate payoffs. The final payoff associated with an infinite ...
Journal title:
- Advances in Complex Systems
Volume 4, Issue
Pages -
Publication date 2001